Metagenomic insights into microbial community structure and metabolism in alpine permafrost on the Tibetan Plateau

Permafrost, characterized by its frozen soil, serves as a unique habitat for diverse microorganisms. Understanding these microbial communities is crucial for predicting the response of permafrost ecosystems to climate change. However, large-scale evidence regarding stratigraphic variations in microbial profiles remains limited. Here, we analyze microbial community structure and functional potential based on 16S rRNA gene amplicon sequencing and metagenomic data obtained from an ∼1000 km permafrost transect on the Tibetan Plateau. We find that microbial alpha diversity declines but beta diversity increases down the soil profile. Microbial assemblages are primarily governed by dispersal limitation and drift, with the importance of drift decreasing but that of dispersal limitation increasing with soil depth. Moreover, genes related to reduction reactions (e.g., ferric iron reduction, dissimilatory nitrate reduction, and denitrification) are enriched in the subsurface and permafrost layers. In addition, microbial groups involved in alternative electron accepting processes are more diverse and contribute highly to community-level metabolic profiles in the subsurface and permafrost layers, likely reflecting the lower redox potential and more complicated trophic strategies for microorganisms in deeper soils. Overall, these findings provide comprehensive insights into large-scale stratigraphic profiles of microbial community structure and functional potentials in permafrost regions.

acetogenesis, and 160 genomes demonstrated a capacity to convert acetate into acetyl-CoA (Fig. 5b), showing that fermentation was an essential pathway for microbial energy acquisition in permafrost ecosystems 3 . In terms of complex carbon catabolism, 158 genomes contained genes for chitin degradation, 119 had genes for amylolytic enzymes, and 95 possessed genes for cellulose degradation (Fig. 5b).
Regarding the nitrogen cycle, a substantial number of MAGs exhibited reduction capacity. Specifically, 60 MAGs showed potential for nitrous oxide reduction, 61 MAGs for nitrite reduction, 60 MAGs for nitrite reduction to ammonia, and 45 MAGs for nitrate reduction (Fig. 5b). With regard to other pathways, 273 MAGs had the ability to reduce Fe (Fig. 5b), indicating that ferric iron may serve as a favorable terminal electron acceptor to fuel microbial anaerobic organic matter degradation 4 .

MAGs.
Based on the metabolic profiles and gene coverage of all 274 MAGs, we calculated metabolic weight scores (MW-score) for each biogeochemical cycling process. A high contribution percentage indicates that a microbial group better represents a function in terms of both gene presence and abundance 44 . The results showed that amino acid utilization (MW-score = 8.9), fermentation (MW-score = 8.7), complex carbon degradation (MW-score = 7.6), and fatty acid degradation (MW-score = 7.3) were the most heavily weighted heterotrophic metabolic pathways at the community level (Fig. 6a). The higher MW-scores implied that these pathways were essential for microbial metabolism and might play an important role in microbial energy acquisition. In addition, acetate oxidation (MW-score = 5.2), CO oxidation (MW-score = 5), aromatics degradation (MW-score = 4.7), and formate oxidation (MW-score = 5) also contributed highly to the energy acquisition of microorganisms (Fig. 6a). The MW-score results further indicated that sulfur oxidation (MW-score = 4.6) and Fe reduction (MW-score = 5.1) were substantial contributors to the metabolism of the microbial community (Fig. 6a). Taken together, these findings demonstrate the diverse metabolisms that support microbial life in permafrost ecosystems.

Supplementary Note 3. The fractional contribution of microbial taxa to each function for all MAGs.
Based on the metabolic weight scores (MW-score) of all 274 MAGs, we calculated the contribution of different taxa to the metabolic weight of each functional pathway. Our results revealed that Actinobacteriota contributed highly to most of the metabolic pathways at the community level, including most of the complex carbon oxidation, fermentation, and redox reactions involved in the nitrogen, sulfur, and iron metabolic pathways (Fig. 6a). The Actinobacteriota exhibit remarkable metabolic diversity 5 that enables them to contribute highly to the microbial metabolic profiles at the community level. Regarding other taxa, α-Proteobacteria have been documented to prefer carbon-rich soils 6,7 and, in line with these observations, we found them to be mainly involved in carbon decomposition processes (including methanol oxidation, formate oxidation, formaldehyde oxidation, and aromatics degradation) and N2 fixation (Fig. 6a). Acidobacteriota, Chloroflexota, γ-Proteobacteria, and Desulfobacterota were the main microbial taxa making high contributions to redox reactions related to carbon, nitrogen, sulfur, and iron metabolism (Fig. 6a). Among these microbial groups, Acidobacteriota had high contributions to nitrate reduction, arsenate reduction, selenate reduction, thiosulfate disproportionation, and iron oxidation (Fig. 6a). Chloroflexota were important contributors to nitrite reduction (nirKS and octR) (Fig. 6a). γ-Proteobacteria were important contributors to methanotrophy, nitrite ammonification (nirBD), and oxidation processes including sulfide oxidation, thiosulfate oxidation, iron oxidation, and arsenite oxidation (Fig. 6a). Desulfobacterota were mainly important for the Wood-Ljungdahl pathway (carbon fixation), nitrite reduction, nitrite ammonification (nrfADH), and sulfite reduction (Fig. 6a). Overall, these findings demonstrate the high degree of metabolic diversity of microorganisms in permafrost ecosystems.

Supplementary Note 4. The novelty of this study compared with earlier publications.
Although previous studies have explored microbial communities and functional potentials in permafrost soils, this study is innovative in three important respects. First, preceding investigations of permafrost microorganisms have predominantly been confined to the site scale (Supplementary Tables 3-4). Our study provided the first large-scale stratigraphic characterization of microorganisms in permafrost regions. To our knowledge, only two studies have examined microorganisms across the pan-Arctic permafrost region by synthesizing sequence data. One of these studies, conducted by Waldrop et al. 8 , focused on the most highly variable genes in permafrost deposits and found that these genes were associated with energy metabolism and C assimilation. The other study, performed by Vishnivetskaya et al. 9 , revealed that photosynthetic organisms in permafrost deposits were effective members of the re-assembled community after permafrost collapse. In contrast, the data set in our study was derived from systematic measurements along a ~1,000 km transect, and soil samples were collected from both the active layer (surface and subsurface layers) and the permafrost deposit. Based on these systematic measurements, combined with thorough data analysis, our study provided several new findings on permafrost microbes. 1) We observed weaker effects of environmental variables in the permafrost layer than in the active layer, suggesting a weaker response of microbes to environmental selection in the permafrost deposit, which may be ascribed to survival strategies such as dormancy 10 . 2) We found that genes participating in reduction reactions (e.g., dissimilatory nitrate reduction, denitrification, ferric iron reduction, sulfide reduction, tetrathionate reduction) were enriched in the permafrost deposit, implying that microbes colonizing permafrost soils possessed specific metabolic capabilities in spite of the harsh conditions. These findings advanced our understanding of microbial profiles in permafrost regions.
Second, current studies of permafrost microorganisms have been mainly confined to high-latitude permafrost regions, with limited evidence from high-altitude permafrost regions (Supplementary Tables 3-4). In this study, we deciphered microbial diversity and biogeographic patterns from both taxonomic and phylogenetic points of view, and used occupancy and specificity analysis to identify the specialist species in the surface, subsurface, and permafrost layers. We also employed a null model to explore the underlying assembly mechanisms of microbial communities in each layer.
Additionally, we unveiled the variations of microbial functional and metabolic attributes from genetic and genomic points of view. Based on these analyses, this study provides a comprehensive view of microbial communities and functional attributes in the largest permafrost region in the mid- and low latitudes of the world.

Supplementary Note 5. Metabolic weight scores (MW-score) calculation method.
To explore microbial functional capacity at the community level, we determined the metabolic weight score (MW-score) according to the following equation 2 : In Eq. (1), the variable MW denotes the MW-score. fi corresponds to the particular function (f) under consideration, ranked at the i-th (i) position among all functions. gn represents the n-th genome within the complete set of genomes. fn signifies the function ranked at the n-th position among all functions. Cg represents the coverage associated with a genome, while Sf signifies the binary state of presence (indicated as 1) or absence (indicated as 0) of a given function within that genome.
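Read together, the symbol definitions for Eq. (1) suggest the form sketched below. This is a reconstruction under stated assumptions rather than a quotation of the cited method: it assumes the MW-score of a function is the coverage-weighted count of genomes carrying that function, normalized by the same quantity summed over all functions. The genome subscript on S is added here for clarity, and any additional scaling (for example, multiplication by 100 to express the score as a percentage) is not specified in the text.

```latex
% Hedged reconstruction of Eq. (1); the normalization and the extra
% genome index on S are assumptions, not taken verbatim from the source.
\begin{equation}
\mathrm{MW}_{f_i} \;=\;
\frac{\displaystyle\sum_{g = g_1}^{g_n} C_g \, S_{f_i,\,g}}
     {\displaystyle\sum_{f = f_1}^{f_n} \; \sum_{g = g_1}^{g_n} C_g \, S_{f,\,g}}
\end{equation}
```

Under this form, a function scores highly when it is present in many genomes, in high-coverage genomes, or both, which agrees with the description of the score as reflecting both gene presence and abundance.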
We then calculated the percentage contribution of each microbial phylum (the default taxonomic level setting) for each function as follows 2 : In Eq. (2

with a high correlation (Spearman's ρ² > 0.7) were removed to avoid redundancy 39 .

Figure S3. The number of diurnal freeze-thaw events within one year over soil depth on the Tibetan Plateau. A diurnal freeze-thaw event, as defined by Baker and Ruschy 40 , is recorded when the daily minimum soil temperature drops below 0°C and the daily maximum soil temperature reaches 0°C or higher. Soil temperature data were derived from seventeen sites on the Tibetan Plateau and were recorded by Yang et al. 41 , Wei et al. 42 , and our research group, respectively.

Cumulative CO2 release in permafrost soils was measured via 400-day incubation at 5°C by Qin et al. 44 . CH4 production potentials of subsurface and permafrost soils were measured through 24 h incubation at 4°C by Song et al. 45 . Nitrogen cycling processes of three soil layers were measured via 24 h incubation at 5°C using 15N labeling by Mao et al. 43 . All these biogeochemical processes were measured using the same soil samples as in this study. Shaded area shows the 95% confidence interval.

A general framework for bioinformatic analysis involved in this study. Amplicon data were processed using the UNOISE method, and the taxonomic and phylogenetic information was obtained based on the amplicon sequencing. Metagenomic data were first assembled into contigs, and genes were predicted and annotated at the contig level. The assembled contigs were then further binned into metagenome-assembled genomes (MAGs), and the taxonomic and metabolic profiles were annotated using GTDB-tk v2.
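The diurnal freeze-thaw criterion stated in the Figure S3 caption (daily minimum below 0°C and daily maximum at or above 0°C) can be illustrated with a minimal sketch. The function name, the input layout (two equal-length lists of daily values in °C), and the toy temperatures are hypothetical and are not taken from the study's data or scripts.

```python
def count_freeze_thaw_days(daily_min, daily_max):
    """Count days meeting the diurnal freeze-thaw criterion.

    A day is counted when its minimum soil temperature is below 0 degC
    and its maximum soil temperature is 0 degC or higher.
    Both inputs are equal-length sequences of daily values in degC.
    """
    if len(daily_min) != len(daily_max):
        raise ValueError("daily_min and daily_max must have the same length")
    return sum(
        1
        for lo, hi in zip(daily_min, daily_max)
        if lo < 0.0 and hi >= 0.0
    )


# Hypothetical five-day record for one soil depth (degC).
tmin = [-3.2, -0.5, 0.4, -1.0, -6.0]
tmax = [1.1, 0.0, 5.3, -0.2, -2.5]
events = count_freeze_thaw_days(tmin, tmax)
print(events)
```

In this toy record only the first two days qualify: the third day never drops below 0°C, and the last two never reach it. Applying the same count to a full year of records at each depth would give the kind of depth profile the caption describes.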

Figure S1. Profile of microbial abundance and composition of dominant phyla

Figure S2. Clustering analysis for reducing the environmental variables

Figure S4. Profile of the functional genes among various soil layers. a The

Figure S5. The taxonomic distribution of the total 274 metagenome-assembled

Figure S6. The difference in sequencing and metagenome-assembled genomes

Figure S7. Differences in environmental variables among three soil layers. Variations of clay (a), silt (b), pH (c), soil moisture (d), soil organic carbon (SOC) (e), dissolved organic carbon (DOC) (f), labile carbon pool Ⅰ (LCP1, mainly polysaccharides) (g), labile carbon pool Ⅱ (LCP2, mostly cellulose) (h), recalcitrant carbon pool (RCP) (i), NH4+-N (j), NO3−-N (k), and dissolved organic nitrogen (DON) (l) among three soil layers. Different lowercase letters in box plots indicate significant differences, which were determined by two-sided pairwise Wilcoxon test (n = 22). Different lowercase letters in bar plots indicate significant differences among three soil layers (P < 0.05). The error bars indicate the standard error of each variable. Soil moisture is defined as the percentage of water in the soil by weight. SUR, surface layer; SUB, subsurface layer; PL, permafrost layer. Soil texture data were obtained from Mao et al. 43 ; the remaining variables were measured in this study. Source data are provided as a Source Data file.

Figure S8. The presence of genes encoding fermentation on the metagenome-

Figure S9. The relationships between biogeochemical processes and gene relative

Figure S10. The comparison of human footprint (HF) index for our sampling sites

Figure S11. A general framework for bioinformatic analysis involved in this study. Amplicon data were processed using the UNOISE method, and

Table S1. Site information about the vegetation, coordinate location, and climatic characteristics.
Soil samples were collected at 24 sites along a ~1,000 km permafrost transect on the plateau. At two of these sites, DNA extraction did not yield sufficient DNA, so only samples from 22 sites were processed in this study. AI, aridity index, determined by dividing mean annual precipitation by mean annual potential evapotranspiration 11 , which was retrieved from the CGIAR-CSI Global-Aridity and Global-PET database (http://www.cgiar-csi.org). NDVI, Normalized Difference Vegetation Index. NDVI data were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard NASA's Terra satellites (http://neo.sci.gsfc.nasa.gov/) with ~1 km resolution for every 16-day interval between July and August 2016 (when soil sampling was conducted).

Cprec denotes the percentage contribution of a microbial group to the MW-score. pj is the specific group (p) under investigation, ranked at the j-th (j) position among all groups. The variables gk and gl represent genomes ranked at the k-th (k) and l-th (l) positions, respectively, among all genomes. The additional notation gk...gl ∈ pj signifies that all genomes falling within this range are encompassed within the studied group pj. Both the MW-score and percentage contribution are determined via METABOLIC v4.0 2 .
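To make the two quantities concrete, the sketch below computes an MW-score per function and the percentage contribution of each taxonomic group to that function from a toy set of genomes. It is an illustration of the definitions given in this note, not the METABOLIC implementation: the function name `mw_scores`, the input layout, the toy coverages, and the scaling of both outputs to percentages are all assumptions.

```python
from collections import defaultdict


def mw_scores(genomes):
    """Return (scores, contrib) for a list of genome records.

    Each record is a dict with:
      'coverage'  : genome coverage (C_g), a non-negative float
      'group'     : taxonomic group label (e.g. phylum)
      'functions' : set of functions present in the genome (S_f = 1)

    scores[f]     : coverage-weighted share of function f among all
                    functions, as a percentage (assumed scaling).
    contrib[f][p] : percentage of function f's weight that comes from
                    genomes belonging to group p.
    """
    weight = defaultdict(float)                          # f -> sum of C_g * S_f
    by_group = defaultdict(lambda: defaultdict(float))   # f -> p -> weight

    for genome in genomes:
        for func in genome["functions"]:
            weight[func] += genome["coverage"]
            by_group[func][genome["group"]] += genome["coverage"]

    total = sum(weight.values())
    if total == 0:
        return {}, {}

    scores = {f: 100.0 * w / total for f, w in weight.items()}
    contrib = {
        f: {p: 100.0 * c / weight[f] for p, c in groups.items()}
        for f, groups in by_group.items()
        if weight[f] > 0
    }
    return scores, contrib


# Hypothetical genomes; coverages and function calls are invented.
toy = [
    {"coverage": 6.0, "group": "Actinobacteriota",
     "functions": {"fermentation", "iron_reduction"}},
    {"coverage": 2.0, "group": "Chloroflexota",
     "functions": {"fermentation"}},
    {"coverage": 2.0, "group": "Desulfobacterota",
     "functions": {"iron_reduction"}},
]
scores, contrib = mw_scores(toy)
print(scores)
print(contrib["fermentation"])
```

In this toy case each function carries half of the total weight, and Actinobacteriota account for three quarters of the fermentation weight because their single genome has three times the coverage of the Chloroflexota genome. This is the sense in which a group's contribution reflects both gene presence and abundance.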

Table S3. Summary of studies concerning permafrost microorganisms based on meta-omics data. "Measured" in the data source column means that data are measured with the same procedures, while "Synthesized" indicates that data are collected from